Abstract

The logarithmic Sobolev inequality [9] for the Hamming cube $\{0,1\}^n$ states that for any
real-valued function $f$ on the cube,

\[ \mathbb{E}_x \sum_{y \sim x} (f(x) - f(y))^2 \ge 2 \cdot \mathrm{Ent}(f^2). \]

We show that the constant $C = 2$ on the right hand side of this inequality can be replaced
by a function $C(\rho)$ depending on $\rho = \frac{\mathrm{Ent}(f^2)}{n \, \mathbb{E} f^2}$. The function $C(\cdot)$ is an increasing convex
function taking $[0, \log 2]$ to $[2, 2/\log 2]$.

We present some applications of this modified inequality. In particular, it is used to
obtain a discrete version of the Faber-Krahn inequality for small subsets of the Hamming
cube, answering a question of Friedman and Tillich [8]. We introduce, following [8], the
notion of a fractional edge-boundary size of a subset of $\{0,1\}^n$, and show Hamming balls
of radius at most $n/2 - \Omega(n^{3/4})$ to be sets with (asymptotically) the smallest fractional
edge-boundary for their size.

1 Introduction

1.1 Isoperimetric problems on the Hamming cube

This paper deals with discrete isoperimetric inequalities on graphs. Let a graph $G = (V, E)$
be given, and let $A \subseteq V$ be a set of vertices in $G$. An isoperimetric inequality addresses the
question of how small the boundary $\partial A$ of $A$ can be, given the cardinality of $A$, by lower
bounding the size of $\partial A$ by an appropriate function of $|A|$. There are various ways to define
and measure the boundary of the set, two salient examples being the vertex boundary of $A$,
consisting of the vertices of $A$ which have neighbors outside $A$, and the edge boundary of $A$,
which is the set of edges crossing from $A$ to its complement. In both these cases, the size of the
boundary is the cardinality of the corresponding set of vertices (or edges). However, one can
also think of examples of a somewhat different nature, some of which will be considered below.

In this paper we deal with a specific graph - the Hamming cube $\{0,1\}^n$. This is a graph
with $2^n$ vertices indexed by boolean strings of length $n$. Two vertices are connected by an edge
if they differ in exactly one coordinate. The metric defined by this graph is called the Hamming
distance. In other words, two vertices $x$ and $y$ are at distance $d$ if they differ in $d$ coordinates.



Let us define two important families of subsets of $\{0,1\}^n$. A Hamming ball is a ball in the
Hamming metric. A subcube is a subset of the vertices obtained by fixing the value in some of
the coordinates. The number of fixed coordinates is called the co-dimension of the subcube. It
turns out [HI [ini [E] that a Hamming ball has the smallest vertex boundary for its size, and a 
subcube has the smallest edge boundary. 

We will consider several versions of the edge-isoperimetric inequality for the cube. Let
$|\partial A|$ be the cardinality of the edge boundary of $A$, normalized, for convenience, by $2^{n-1}$. The
standard version of the inequality [W\ IT2] states that for any subset $A \subseteq \{0,1\}^n$ we have^1

\[ |\partial A| \ge \frac{2}{\log 2} \cdot \frac{|A|}{2^n} \log \frac{2^n}{|A|}. \tag{1} \]

This is tight if $A$ is a subcube of an arbitrary co-dimension $0 \le t \le n$.
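Inequality (1) and its tightness on subcubes can be checked by brute force in a very small dimension. The following is a minimal sketch, assuming vertices of $\{0,1\}^n$ are encoded as integers $0, \ldots, 2^n - 1$ (bit $i$ is coordinate $i$); the function names are illustrative and not taken from the paper.

```python
import math

def edge_boundary_size(A, n):
    """Count edges of {0,1}^n with exactly one endpoint in A (vertices are ints 0..2^n-1)."""
    return sum(1 for x in A for i in range(n) if (x ^ (1 << i)) not in A)

def normalized_boundary(A, n):
    """Edge boundary size normalized by 2^(n-1), as in the text."""
    return edge_boundary_size(A, n) / 2 ** (n - 1)

def isoperimetric_bound(size, n):
    """Right hand side of (1): (2/log 2) * (|A|/2^n) * log(2^n/|A|)."""
    return (2 / math.log(2)) * (size / 2 ** n) * math.log(2 ** n / size)

def subcube(n, t):
    """Subcube of co-dimension t: the t lowest coordinates are fixed to 0."""
    return {x for x in range(2 ** n) if x & ((1 << t) - 1) == 0}

def smallest_gap(n):
    """Minimum of (left side - right side) of (1) over all non-empty subsets of {0,1}^n."""
    gap = float("inf")
    for mask in range(1, 2 ** (2 ** n)):
        A = {x for x in range(2 ** n) if (mask >> x) & 1}
        gap = min(gap, normalized_boundary(A, n) - isoperimetric_bound(len(A), n))
    return gap
```

For $n = 3$ the enumeration covers all 255 non-empty subsets; the gap should be nonnegative up to rounding, and it should vanish on `subcube(3, t)` for every $t$.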

A logarithmic Sobolev inequality [9] establishes a relation between two (appropriately defined)
notions: the variation of a function and its entropy. For a function $f$ on the Hamming cube
$\{0,1\}^n$ endowed with the uniform measure, this translates to

\[ \mathbb{E}_x \sum_{y \sim x} (f(x) - f(y))^2 \ge 2 \cdot \mathrm{Ent}(f^2) = 2 \cdot \left( \mathbb{E} f^2 \log f^2 - \mathbb{E} f^2 \log \mathbb{E} f^2 \right). \tag{2} \]

It is useful to view (2) as an isoperimetric inequality. In particular ([7]), it implies a functional
form of the edge-isoperimetric inequality (1). For a non-zero function $f : \{0,1\}^n \to \mathbb{R}$,

\[ \mathbb{E}_x \sum_{y \sim x} (f(x) - f(y))^2 \ge 2 \cdot \mathbb{E} f^2 \log \frac{\mathbb{E} f^2}{\mathbb{E}^2 |f|}. \tag{3} \]

Choosing $f$ in (3) to be the characteristic function of a subset $A$ of the cube, we recover the
edge-isoperimetric inequality (1) with a somewhat worse constant on the right hand side, replacing
$2/\log 2$ with 2. On the other hand, there are examples ([7]) of real-valued functions $f$ for which
the constant 2 in (3) is tight.
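Inequalities (2) and (3) are easy to test numerically on random functions. The sketch below assumes a function on $\{0,1\}^n$ is stored as a list of length $2^n$ indexed by the integer encoding of the vertex; the helper names are illustrative only. It checks the chain: variation $\ge 2\,\mathrm{Ent}(f^2) \ge 2\,\mathbb{E} f^2 \log (\mathbb{E} f^2 / \mathbb{E}^2|f|)$.

```python
import math
import random

def variation(f, n):
    """E_x sum_{y~x} (f(x) - f(y))^2 for f given as a list of length 2^n."""
    N = 2 ** n
    return sum((f[x] - f[x ^ (1 << i)]) ** 2 for x in range(N) for i in range(n)) / N

def ent_of_square(f):
    """Ent(f^2) = E f^2 log f^2 - E f^2 log E f^2 (uniform measure, natural logs)."""
    N = len(f)
    m2 = sum(v * v for v in f) / N
    if m2 == 0:
        return 0.0
    return sum(v * v * math.log(v * v) for v in f if v != 0) / N - m2 * math.log(m2)

def functional_rhs(f):
    """E f^2 log(E f^2 / E^2|f|): the right hand side of (3) without the factor 2."""
    N = len(f)
    m2 = sum(v * v for v in f) / N
    m1 = sum(abs(v) for v in f) / N
    return m2 * math.log(m2 / m1 ** 2)

def check(n, trials, seed=0):
    """Return True if (2) and Ent(f^2) >= functional_rhs(f) hold on random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        f = [rng.uniform(-1, 1) for _ in range(2 ** n)]
        d, e, r = variation(f, n), ent_of_square(f), functional_rhs(f)
        if d < 2 * e - 1e-9 or e < r - 1e-9:
            return False
    return True
```

For the characteristic function of a set, `ent_of_square` and `functional_rhs` coincide, which is the case used to recover (1) from (3).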

Let us now briefly sketch the results in this paper, before discussing them in fuller detail
in subsection 1.4 below. We will show that the constant $C = 2$ on the right hand side of
the logarithmic Sobolev inequality (2) can be replaced by a function $C(\rho)$ depending on
$\rho = \frac{\mathrm{Ent}(f^2)}{n \, \mathbb{E} f^2}$. The function $C(\rho)$ will be given explicitly. It is a convex function which increases
from 2 to $2/\log 2$ as $\rho$ goes from 0 to $\log 2$. We will also observe that $C(\rho)$ gives the correct
dependence on $\rho$, describing functions on $\{0,1\}^n$ for which the modified inequality is tight.

This will imply a corresponding modification of the functional isoperimetric inequality (3).
Here, as well as in (2), it will be possible to replace the constant $C = 2$ by the function $C(\rho)$.

The modified version of (3) is used to derive a discrete analogue of the classical Faber-Krahn
inequality in $\mathbb{R}^n$ for small subsets of the Hamming cube $\{0,1\}^n$, answering^2 a question



^1 We use natural logarithms throughout the paper.

^2 Up to an error which becomes negligible as the dimension of the cube grows.



from [8]. We will introduce, following [8], the notion of a fractional edge-boundary size of a
subset of $\{0,1\}^n$, and show Hamming balls of radius at most $n/2 - \Omega(n^{3/4})$ to be sets with
(asymptotically) the smallest fractional edge-boundary for their size.

The question in [8] is a part of an approach to obtain upper bounds on the cardinality of
binary error-correcting codes. We will now take a brief detour to the theory of error-correcting
codes in order to provide a natural framework in which this question can be discussed.

1.2 Bounds on binary error correcting codes 

A binary error-correcting code of length $n$ and minimal distance $d$ is a subset $C$ of the boolean
cube $\{0,1\}^n$ such that the distance between any two distinct points in $C$ is at least $d$. In other
words, the points in $C$ can be taken as centers in a disjoint packing of Hamming balls of radius
$\frac{d-1}{2}$ into $\{0,1\}^n$.

The question of the maximal possible cardinality $A(n, d)$ of such a packing is one of the
central questions of coding theory. The best known upper bounds on $A(n, d)$ were obtained in
[17] following Delsarte's linear programming approach [6]. The analysis in [17] uses the theory of
orthogonal polynomials and is somewhat complicated.

A different approach to obtain some of the bounds in [17] was presented in [8]. The appeal
of this new approach is in showing the possibility to work with Delsarte's linear inequalities
without resorting to the language and tools of orthogonal polynomial theory. In particular, [8]
establishes a connection between packing bounds and isoperimetric questions in the Hamming
cube. To describe this connection, we need a notion of the fractional edge boundary size of a
subset of the cube.^3

For $A \subseteq \{0,1\}^n$, the fractional edge boundary size of $A$ is defined as

\[ |\partial^* A| = \min \left\{ \mathbb{E}_x \sum_{y \sim x} (f(x) - f(y))^2 \; : \; \mathrm{supp}(f) \subseteq A, \; \mathbb{E} f^2 = \frac{|A|}{2^n} \right\}. \tag{4} \]

The right hand side of this definition computes a minimum of the variation of $f$ over a certain
family of functions. Note that this family contains the characteristic function of $A$ whose
variation equals the (normalized) cardinality of the edge boundary $\partial A$. Consequently, the
fractional boundary of $A$ is at most as large as its boundary.
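For small sets the minimum in (4) can be computed numerically. The sketch below relies on a reformulation that is an assumption of this example rather than a statement from the text: expanding the square gives $\mathbb{E}_x \sum_{y \sim x} (f(x) - f(y))^2 = 2n \mathbb{E} f^2 - 2^{1-n} \sum_x \sum_{y \sim x} f(x) f(y)$, so for $f$ supported on $A$ the minimum equals $2 (n - \lambda_A) |A| / 2^n$, where $\lambda_A$ is the largest eigenvalue of the cube's adjacency matrix restricted to $A$. The eigenvalue is approximated by plain power iteration on the shifted matrix $M_A + nI$; vertices are encoded as integers and $A$ is assumed non-empty.

```python
import math

def fractional_boundary(A, n, iters=500):
    """Approximate the fractional edge boundary size of a non-empty A in {0,1}^n.

    Uses the identity 2 (n - lam) |A| / 2^n, where lam is the top eigenvalue of
    the adjacency matrix of the cube restricted to A (found by power iteration).
    """
    verts = sorted(A)
    index = {x: k for k, x in enumerate(verts)}
    # neighbors of each vertex that also lie in A
    nbrs = [[index[x ^ (1 << i)] for i in range(n) if (x ^ (1 << i)) in index]
            for x in verts]
    v = [1.0] * len(verts)
    for _ in range(iters):
        # apply M_A + n I; the shift makes all eigenvalues nonnegative
        w = [n * v[k] + sum(v[j] for j in nbrs[k]) for k in range(len(verts))]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # Rayleigh quotient of M_A at the (unit length) iterate
    lam = sum(v[k] * sum(v[j] for j in nbrs[k]) for k in range(len(verts)))
    return 2 * (n - lam) * len(verts) / 2 ** n
```

For the Hamming ball $\{000, 100, 010, 001\}$ in $\{0,1\}^3$ the restricted graph is a star with top eigenvalue $\sqrt{3}$, so the sketch should return about $3 - \sqrt{3} \approx 1.27$, below the normalized edge boundary $6/4 = 1.5$ of the same ball; for a subcube of co-dimension one it should return 1.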

The following result is proved in [181 [T9], following the approach of [8]. Let us mention that
this claim was proved in [8] for an important special case of linear codes.

Theorem 1.1: Let $A$ be a subset of the boolean cube $\{0,1\}^n$ such that

\[ |\partial^* A| \le (2d + 1) \cdot \frac{|A|}{2^{n-1}}. \]

Let $C$ be a binary error-correcting code with minimal distance $d$. Then

\[ |C| \le n |A|. \]



^3 This is a reformulation of a closely related notion of the maximal eigenvalue of a subset introduced in [8].



This suggests a way to obtain upper bounds for codes by finding subsets of the cube with a
small fractional boundary. Natural candidates to try are the isoperimetric sets, that is, Hamming
balls and subcubes. Their fractional boundaries are analyzed in [8]. It turns out that among
these two options, a Hamming ball has the smaller fractional edge boundary. Note that this
pinpoints an intriguing difference between the notions of the fractional edge-boundary and that
of the 'ordinary' edge-boundary, for which the subcubes are the optimal sets.

Let $B$ denote a Hamming ball of radius $r$. It is shown in [8] that

\[ |\partial^* B| \le 4 \left( \frac{n}{2} - \sqrt{r(n-r)} + o(n) \right) \cdot \frac{|B|}{2^n}. \tag{5} \]

Combined with Theorem 1.1, this shows that a binary error-correcting code with minimal
distance $d$ is at most as large, up to negligible multiplicative factors, as a Hamming ball of
radius $r = n/2 - \sqrt{d(n-d)}$. This provides an alternative proof of the first linear programming
bound for binary codes [17].
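The choice of radius can be made explicit by a short computation. The following worked step is a sketch, assuming $d \le n/2$ (so that $n/2 - d \ge 0$) and ignoring the $o(n)$ term:

```latex
r = \frac{n}{2} - \sqrt{d(n-d)}
\;\Longrightarrow\;
r(n-r) = \frac{n^2}{4} - d(n-d) = \left( \frac{n}{2} - d \right)^2
\;\Longrightarrow\;
\frac{n}{2} - \sqrt{r(n-r)} = d ,
\qquad\text{so (5) reads}\qquad
|\partial^* B| \le \bigl( 4d + o(n) \bigr) \cdot \frac{|B|}{2^n}
= \bigl( 2d + o(n) \bigr) \cdot \frac{|B|}{2^{n-1}} .
```

Up to the lower-order term, this is the hypothesis of Theorem 1.1 with $A = B$.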



This concludes our detour into coding theory. We are now ready to state the isoperimetric
problem of [8].

1.3 An isoperimetric problem for the Hamming cube

In order to obtain the best possible bounds on codes via Theorem 1.1, we need to find subsets
of the Hamming cube with the smallest possible fractional edge-boundary. In particular, the
existence of subsets whose fractional boundary is noticeably smaller than that of Hamming balls
of the same cardinality would imply an improvement on the best currently known bounds. This
naturally leads to the following questions [8].

A fractional edge-isoperimetric problem for the Hamming cube:

• What is the smallest possible fractional boundary of a subset of $\{0,1\}^n$ of a given
cardinality?

• Which sets have the smallest fractional boundaries?

These questions were the starting point of our investigation. Before describing our results,
let us mention a connection to the classical Faber-Krahn inequality in $\mathbb{R}^n$, as pointed out in [8].

First, here is a brief description of the Euclidean space inequality, following [5]. For an open
set $\Omega$ in $\mathbb{R}^n$, consider the functional

\[ F[\phi] = \frac{\| \mathrm{grad} \, \phi \|^2}{\| \phi \|^2}, \tag{6} \]

where $\phi$ ranges over smooth functions supported in $\Omega$, and the associated infimum
$\lambda^*(\Omega) = \inf_{\phi} F[\phi]$.

$\lambda^*(\Omega)$ is referred to as the fundamental tone of $\Omega$. The Faber-Krahn inequality states that
among all sets $\Omega$ of the same measure, the Euclidean ball has the minimal fundamental tone.



In the discrete setting of the Hamming cube, a reasonable interpretation of (6) is to consider
the functional

\[ F[f] = \frac{\mathbb{E}_x \sum_{y \sim x} (f(x) - f(y))^2}{\mathbb{E} f^2}, \]

where $f$ ranges over functions supported in a subset $A$ of $\{0,1\}^n$. In our terminology, the
"fundamental tone" of $A$ is

\[ \lambda^*(A) = \min_f F[f] = \frac{2^n}{|A|} \cdot |\partial^* A|. \]

Hence, the set with the smallest fractional boundary for its size has the minimal fundamental
tone, and vice versa.

Following [8] we will refer to the fractional edge-isoperimetric problem as the discrete
Faber-Krahn problem for the Hamming cube.

1.4 Main results 

Our main technical result is a modified version of the logarithmic Sobolev inequality (2). Let
$H(x) = -x \log x - (1 - x) \log(1 - x)$ be the "natural" (i.e., using natural logarithms) entropy
function.

Theorem 1.2:

• Let $f : \{0,1\}^n \to \mathbb{R}$ be a non-zero function, and let $\rho = \frac{\mathrm{Ent}(f^2)}{n \, \mathbb{E} f^2}$. Then

\[ \mathbb{E}_x \sum_{y \sim x} (f(x) - f(y))^2 \ge C(\rho) \cdot \mathrm{Ent}(f^2), \tag{7} \]

where

\[ C(x) = \frac{4}{x} \cdot \left( \frac12 - \sqrt{H^{-1}(\log 2 - x) \left( 1 - H^{-1}(\log 2 - x) \right)} \right). \]

• The function $C(\cdot)$ is an increasing convex function, taking $[0, \log 2]$ to $[2, 2/\log 2]$.

Inequality (7) is tight in the following sense: for each $\rho \in [0, \log 2]$ there exists a non-constant
function $f = f_\rho$ such that $\mathrm{Ent}(f^2) \ge \rho n \mathbb{E} f^2$ and

\[ \mathbb{E}_x \sum_{y \sim x} (f(x) - f(y))^2 \le (1 + o_n(1)) \cdot C(\rho) \cdot \mathrm{Ent}(f^2). \tag{8} \]

This follows from the tightness of inequality (10) below (see the second part of Theorem 1.4),
since that inequality is a corollary of (7). The fact that (10) is tight for Hamming balls follows
from (5). The functions $f_\rho$ are the minimal variation functions supported on a Hamming ball
of an appropriate radius. They are constructed explicitly in [8].
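The function $C$ of Theorem 1.2 is straightforward to evaluate numerically. The sketch below assumes $C(x) = \frac{4}{x} \bigl( \frac12 - \sqrt{p(1-p)} \bigr)$ with $p = H^{-1}(\log 2 - x)$, where $H^{-1}$ is taken to be the inverse of $H$ on $[0, 1/2]$ (an assumption about the intended branch), computed by bisection; $C(0)$ is set to the limiting value 2.

```python
import math

LOG2 = math.log(2)

def H(x):
    """Natural-log binary entropy, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def H_inv(y):
    """Inverse of H restricted to [0, 1/2], by bisection (H is increasing there)."""
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = (lo + hi) / 2
        if H(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def C(rho):
    """C(rho) = (4/rho) * (1/2 - sqrt(p (1 - p))) with p = H_inv(log 2 - rho)."""
    if rho <= 0:
        return 2.0  # limiting value as rho -> 0
    p = H_inv(LOG2 - rho)
    return (4 / rho) * (0.5 - math.sqrt(p * (1 - p)))
```

Evaluating `C` on a grid in $(0, \log 2]$ gives a quick sanity check of the second claim of the theorem: the values should increase from about 2 to $2/\log 2 \approx 2.885$.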

Theorem 1.2, together with the observation $\mathrm{Ent}(f^2) \ge \mathbb{E} f^2 \log \frac{\mathbb{E} f^2}{\mathbb{E}^2 |f|}$ ([7]), implies a
corresponding modification of the functional isoperimetric inequality (3).

Corollary 1.3:

\[ \mathbb{E}_x \sum_{y \sim x} (f(x) - f(y))^2 \ge C(\rho) \cdot \mathbb{E} f^2 \log \frac{\mathbb{E} f^2}{\mathbb{E}^2 |f|}, \tag{9} \]

where $\rho = \frac{1}{n} \log \frac{\mathbb{E} f^2}{\mathbb{E}^2 |f|}$. Inequality (9) is tight in the same sense and for the same reasons (8) is
tight.

Let us briefly discuss this inequality. It implies, in particular, that as the ratio $\frac{\mathbb{E} f^2}{\mathbb{E}^2 |f|}$ grows
(the function $f$ becomes less "flat") its edge-isoperimetric constant approaches the isoperimetric
constant $C = \frac{2}{\log 2}$ in the edge-isoperimetric inequality (1) for 0-1 functions. One possible partial
explanation for this phenomenon is that, for functions supported on a small set $A \subseteq \{0,1\}^n$,
the main contribution to the variation $\mathbb{E}_x \sum_{y \sim x} (f(x) - f(y))^2$ is likely to come from edges $(x, y)$
which belong to the edge-boundary of $A$.

It is now straightforward to derive the fractional edge-isoperimetric inequality (10) from
Corollary 1.3. Let $f$ be a function supported on a subset $A$ of $\{0,1\}^n$. Then, by the
Cauchy-Schwarz inequality, $\frac{\mathbb{E} f^2}{\mathbb{E}^2 |f|} \ge \frac{2^n}{|A|}$. Since the function $C(\cdot)$ is monotone, we have

\[ \mathbb{E}_x \sum_{y \sim x} (f(x) - f(y))^2 \ge C\left( \frac1n \log \frac{\mathbb{E} f^2}{\mathbb{E}^2 |f|} \right) \cdot \log \frac{\mathbb{E} f^2}{\mathbb{E}^2 |f|} \cdot \mathbb{E} f^2 \ge C\left( \frac1n \log \frac{2^n}{|A|} \right) \cdot \log \frac{2^n}{|A|} \cdot \mathbb{E} f^2. \]

Recalling the definition of the fractional edge boundary (4), and substituting the explicit
expression for $C(\cdot)$, gives (10).

The inequality (10), together with the second part of Theorem 1.4, provides an asymptotic
solution to the Faber-Krahn problem for the Hamming cube, at least in the range of interest to
coding theory. It turns out that, up to an error which becomes negligible as the dimension
$n$ grows, Hamming balls of radius $0 \le r \le \frac{n}{2} - o(n)$ are the sets with the smallest fractional
boundary (fundamental tone) for their size.

From the viewpoint of coding theory, this implies that Theorem 1.1 cannot lead to an
improvement on the best currently known bounds for binary codes [17]. Let us briefly discuss
one implication of this fact. The bounds in [17] are obtained following Delsarte's linear
programming approach, and there are claims in coding theory ([2l|T8]) which seem to indicate that
these are the best bounds attainable with this approach. Since Theorem 1.1 is derived within
the same linear programming framework, inequality (10) may be interpreted as additional
evidence in this direction^4. It seems worthwhile to point out that, in this manner, coding
theory provides both the question prompting this investigation and an indication of what the
answer might be, by suggesting the putative optimality of Hamming balls for the Faber-Krahn
problem.

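Theorem 1.4:

• Let $H(x) = -x \log x - (1 - x) \log(1 - x)$ be the entropy function. Then for any subset $A$
of $\{0,1\}^n$,

\[ |\partial^* A| \ge 4n \left( \frac12 - \sqrt{H^{-1}\left( \frac{\log |A|}{n} \right) \left( 1 - H^{-1}\left( \frac{\log |A|}{n} \right) \right)} \right) \cdot \frac{|A|}{2^n}. \tag{10} \]

• On the other hand, let $B$ be a Hamming ball. Then

\[ |\partial^* B| \le 4n \left( \frac12 - \sqrt{H^{-1}\left( \frac{\log |B|}{n} \right) \left( 1 - H^{-1}\left( \frac{\log |B|}{n} \right) \right)} + o(1) \right) \cdot \frac{|B|}{2^n}. \]

^4 ... and it is not, in this sense, very surprising.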

The second part of this theorem is due to [8]. It follows from (5) and the fact that the
cardinality of a Hamming ball of radius $r$ is at least $\exp\{ n \cdot H(r/n) - o(n) \}$ [16].

The bound in (10) is not very good for balanced subsets $A$, for which $\frac{\log |A|}{n}$ is close to $\log 2$.
For instance, for $|A| = 2^{n-1}$, the bound gives $|\partial^* A| \ge \log 2$. On the other hand, it is not hard
to see that the correct bound in this case is $|\partial^* A| \ge 1$. Equality is attained on a subcube of
co-dimension one (but not on a Hamming ball of radius $n/2$).
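The value $\log 2$ quoted for $|A| = 2^{n-1}$ can be recovered in one line. The following is a sketch, assuming (10) is used in the form $|\partial^* A| \ge n \rho_A \, C(\rho_A) \cdot |A| / 2^n$ with $\rho_A = \frac1n \log \frac{2^n}{|A|}$ (the form obtained from Corollary 1.3 before substituting the explicit expression for $C$), and using $C(\rho) \to 2$ as $\rho \to 0$:

```latex
|A| = 2^{n-1}: \qquad
\rho_A = \frac{1}{n} \log \frac{2^n}{|A|} = \frac{\log 2}{n}, \qquad
n \rho_A \, C(\rho_A) \cdot \frac{|A|}{2^n}
= \frac{\log 2}{2} \cdot C\!\left( \frac{\log 2}{n} \right)
\;\longrightarrow\; \log 2 \quad (n \to \infty).
```

This is strictly below the value 1 attained by a subcube of co-dimension one, which is the gap the paragraph above points out.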

On the other hand, (10) is interesting, as long as the error term $o(1)$ in the second part of
Theorem 1.4 is of order lower than that of the main term. This error term is of order $n^{-1/2}$,
up to poly-logarithmic terms^5. Therefore, the theorem provides a satisfactory lower bound for
the fractional edge boundary size of $A$ as long as

\[ \left| \frac{\log |A|}{n} - \log 2 \right| \gg n^{-1/2}. \]

In particular, Hamming balls of radius at most $n/2 - \Omega(n^{3/4})$ have (asymptotically) the smallest
fractional edge-boundary for their size.

Questions: 

• How small can the fractional edge boundary size of a balanced subset $A$ of $\{0,1\}^n$ be?

• Which balanced sets have the smallest fractional edge-boundary? 

Finally, let us briefly mention an additional application of inequality (7), of a somewhat
different nature. The 'standard' logarithmic Sobolev inequality (2) is used in [9] to derive a
hypercontractive inequality for functions on the discrete cube (see also [H [S]). In order to
describe this inequality, let $\{w_S\}_{S \in \{0,1\}^n}$ be the character basis in the space of real-valued
functions on the Hamming cube. Let $0 \le \epsilon \le 1$ and let $T = T_\epsilon$ be a linear operator taking a
function $f = \sum_S \hat{f}(S) w_S$ to $Tf = \sum_S \epsilon^{|S|} \hat{f}(S) w_S$. Then ([9])

\[ \| Tf \|_2 \le \| f \|_{1 + \epsilon^2}. \tag{11} \]
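The hypercontractive inequality (11) can be tested directly in small dimension. The sketch below assumes the characters are $w_S(x) = (-1)^{\langle x, S \rangle}$ with $S$ encoded as a bit mask, norms are taken with respect to the uniform measure, and the operator multiplies the coefficient of $w_S$ by $\epsilon^{|S|}$; all function names are illustrative.

```python
import random

def walsh_coeffs(f, n):
    """Fourier coefficients hat f(S) = E_x f(x) w_S(x), with w_S(x) = (-1)^{<x,S>}."""
    N = 2 ** n
    return [sum(f[x] * (-1) ** bin(x & S).count("1") for x in range(N)) / N
            for S in range(N)]

def noise_operator(f, n, eps):
    """T f = sum_S eps^{|S|} hat f(S) w_S, returned as a list of values."""
    N = 2 ** n
    fhat = walsh_coeffs(f, n)
    return [sum(eps ** bin(S).count("1") * fhat[S] * (-1) ** bin(x & S).count("1")
                for S in range(N))
            for x in range(N)]

def norm(f, p):
    """(E |f|^p)^(1/p) with respect to the uniform measure."""
    return (sum(abs(v) ** p for v in f) / len(f)) ** (1 / p)

def check_hypercontractivity(n, eps, trials, seed=0):
    """Return True if ||T f||_2 <= ||f||_{1 + eps^2} on random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        f = [rng.uniform(-1, 1) for _ in range(2 ** n)]
        if norm(noise_operator(f, n, eps), 2) > norm(f, 1 + eps ** 2) + 1e-9:
            return False
    return True
```

With `eps = 1` the operator is the identity, and both sides of (11) are the 2-norm of `f`.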



^5 This could be derived from the computation in \E\, or, alternatively, from the estimates on the minimal roots
of Krawchouk polynomials [71118].



Substituting (7) instead of (2) in the proof in [9] leads to a modified version of (11). It
turns out that the exponent 2 on the right hand side of the inequality can be replaced by a
function $e(\rho)$, depending on $\rho = \frac{\mathrm{Ent}(f^2)}{n \, \mathbb{E} f^2}$. The function $e(\rho)$ is decreasing, with $e(0) = 2$.^6
This modified inequality might be useful in coding theory, following the applications of (11) in

mm- 

2 The proof of Theorem 1.2

Let us start with a brief overview. The main goal of this section is to prove the logarithmic
Sobolev inequality (7). Our proof follows the outline of the proof of (2) in [9]. We will prove
an inequality (13), which will imply (7) as a corollary, first for the base case $n = 1$, and then
for general $n$, using subadditivity of entropy. Compared to [9], we need to prove a bit more
for the base case (see Remark 2.2 below) and to carry this additional information along to the
general case. This seems to complicate things somewhat, making it necessary to go through
the intermediate inequality (13).

We will prove the second claim of the theorem, on the properties of the function $C(\rho)$ in
(7), along the way, in Lemma 2.1.

We may and will assume from now on that we deal only with nonnegative functions on
$\{0,1\}^n$, since substituting $|f|$ instead of $f$ in (7) decreases the left hand side and does not affect
the right hand side.

Several functions on the real line play an important role in the proof. We start by defining
these functions and stating some of their properties. Let $\psi$ be defined on $[0, 1]$ by

\[ \psi(t) = \frac12 (1 - \sqrt{t})^2 \log (1 - \sqrt{t})^2 + \frac12 (1 + \sqrt{t})^2 \log (1 + \sqrt{t})^2 - (1 + t) \log (1 + t). \tag{12} \]

In other words, $\psi(t) = \mathrm{Ent}(f^2)$, where $f$ is a function on $\{0, 1\}$ with $f(0) = 1 - \sqrt{t}$, $f(1) = 1 + \sqrt{t}$.

The following main technical lemma lists the relevant properties of $\psi$ and several derived
functions, including the function $C$. The proof of the lemma is rather long and is postponed
till the next section.
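Definition (12) can be cross-checked against a direct entropy computation on the two-point space. The following is a minimal sketch, with $0 \log 0$ taken to be 0 and illustrative function names.

```python
import math

def xlogx(u):
    """u * log(u), with the convention 0 * log 0 = 0."""
    return 0.0 if u == 0 else u * math.log(u)

def psi(t):
    """psi from (12), for t in [0, 1]."""
    s = math.sqrt(t)
    return 0.5 * xlogx((1 - s) ** 2) + 0.5 * xlogx((1 + s) ** 2) - xlogx(1 + t)

def ent_of_square_two_point(a, b):
    """Ent(f^2) for f on {0, 1} with f(0) = a, f(1) = b (uniform measure)."""
    m2 = (a * a + b * b) / 2
    return 0.5 * xlogx(a * a) + 0.5 * xlogx(b * b) - xlogx(m2)
```

On a grid in $[0, 1]$ one can also observe the base-case inequality $\psi(t) \le 2t$, the endpoint value $\psi(1) = 2 \log 2$, and the monotonicity and concavity of $\psi$ that the lemma below asserts.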

Lemma 2.1:

1. The function $\psi$ is strictly increasing and concave on $[0, 1]$, taking this interval onto
$[0, 2 \log 2]$. This allows us to define the inverse function $\phi = \psi^{-1}$. This is a strictly
increasing convex function taking $[0, 2 \log 2]$ onto $[0, 1]$.

2. The function $\psi(t)/(1 + t)$ is strictly increasing and concave on $[0, 1]$, taking this interval
onto $[0, \log 2]$. This allows us to define the inverse function $\alpha(t) = \left( \frac{\psi(t)}{1 + t} \right)^{-1}$. This is a
strictly increasing convex function taking $[0, \log 2]$ to $[0, 1]$.

3. The function $c(t) = \frac{4 \alpha(t)}{t (1 + \alpha(t))}$ is strictly increasing and convex on $[0, \log 2]$, taking this
interval onto $[2, 2/\log 2]$.^7

4. The function $c(t)$ has an explicit representation

\[ c(t) = \frac{4}{t} \cdot \left( \frac12 - \sqrt{H^{-1}(\log 2 - t) \left( 1 - H^{-1}(\log 2 - t) \right)} \right). \]

In other words $c = C$, where $C$ is the function in (7).

^6 Hence, for functions with high entropy, this gives a strengthening of (11).

Note that the second claim of Theorem 1.2 follows from the third and the fourth claims of this
lemma.
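The third and fourth claims of the lemma can be compared numerically: one evaluates $c(t) = 4\alpha(t) / (t(1 + \alpha(t)))$ with $\alpha$ obtained by inverting $\psi(a)/(1 + a)$, and separately the closed form involving $H^{-1}$. The sketch below makes two assumptions not spelled out in the text: both inverses are computed by bisection, and $H^{-1}$ denotes the inverse of $H$ on $[0, 1/2]$.

```python
import math

LOG2 = math.log(2)

def xlogx(u):
    return 0.0 if u == 0 else u * math.log(u)

def psi(t):
    """psi from (12)."""
    s = math.sqrt(t)
    return 0.5 * xlogx((1 - s) ** 2) + 0.5 * xlogx((1 + s) ** 2) - xlogx(1 + t)

def xi(a):
    """psi(a) / (1 + a), increasing on [0, 1] by the second claim of the lemma."""
    return psi(a) / (1 + a)

def H(p):
    """Natural-log binary entropy."""
    return -xlogx(p) - xlogx(1 - p)

def invert_increasing(g, target, lo, hi, steps=200):
    """Solve g(x) = target by bisection for an increasing g on [lo, hi]."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def c_via_alpha(t):
    """c(t) = 4 alpha / (t (1 + alpha)), with alpha the inverse of xi at t."""
    a = invert_increasing(xi, t, 0.0, 1.0)
    return 4 * a / (t * (1 + a))

def c_closed_form(t):
    """c(t) = (4/t) (1/2 - sqrt(p (1 - p))), with p = H^{-1}(log 2 - t) in [0, 1/2]."""
    p = invert_increasing(H, LOG2 - t, 0.0, 0.5)
    return (4 / t) * (0.5 - math.sqrt(p * (1 - p)))
```

The two evaluations should agree to many digits for $t$ strictly inside $(0, \log 2)$; near the endpoint $t = \log 2$ the inversion of `xi` is ill-conditioned (its derivative vanishes at 1), so less agreement is to be expected there.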

Remark 2.2: We mentioned a difference between the proof of the base case $n = 1$ here and
in [9]. Let us give some details. In [9], (2) is shown for $\{0, 1\}$, which, in our notation, amounts
to proving an inequality $\psi(t) \le 2t$ on $[0, 1]$. We show, in addition, that $\psi$ is concave on $[0, 1]$.
This additional concavity property turns out to be crucial for our proof. ∎

Next, we introduce additional notation, and prove a simple auxiliary inequality.

For a function $f$ on $\{0,1\}^n$ let $D^2(f) = \mathbb{E}_x \sum_{y \sim x} (f(x) - f(y))^2$, and let
$K^2(f) = \frac14 \cdot \mathbb{E}_x \sum_{y \sim x} (f(x) + f(y))^2$. Note that $K^2(f) = n \mathbb{E} f^2 - \frac14 \cdot D^2(f)$. Note also that for a non-zero
nonnegative function $f$, $K^2(f)$ is strictly positive.

Lemma 2.3: For a nonnegative function $f$ on $\{0,1\}^n$,

\[ \mathrm{Ent}(f^2) \le 2 \log 2 \cdot K^2(f). \]

Proof: We will prove the claim for the base case $n = 1$ and then use subadditivity of entropy
to deduce it for the general case.

The case $n = 1$. We may assume $f$ is non-zero, since the claim is trivially true otherwise.
Both sides of the inequality are 2-homogeneous, and consequently we may assume $\mathbb{E} f = 1$.
Without loss of generality, $f(0) = 1 - s$, $f(1) = 1 + s$ for some $0 \le s \le 1$. We need to show

\[ \mathrm{Ent}(f^2) = \psi(s^2) \le 2 \log 2 \cdot K^2(f) = 2 \log 2, \]

and this is indeed true by the first claim of Lemma 2.1, since $\psi$ is increasing and $\psi(1) = 2 \log 2$.

The general case $n > 1$. Let $1 \le i \le n$ be an index of a coordinate and let $x \in \{0,1\}^n$.
Fixing all the coordinates $j \ne i$ to be $x_j$ we obtain a copy of $\{0, 1\}$. Let $f_i^x$ be the restriction
of $f$ to this one-dimensional cube.

Recall ([H]) that entropy is subadditive, namely

\[ \sum_{i=1}^n \mathbb{E}_x \mathrm{Ent}\left( (f_i^x)^2 \right) \ge \mathrm{Ent}(f^2), \]

while $D^2(f)$ is additive, that is

\[ D^2(f) = \sum_{i=1}^n \mathbb{E}_x D^2(f_i^x). \]

$K^2(f)$ is also additive, since $\sum_{i=1}^n \mathbb{E}_x K^2(f_i^x) = \sum_{i=1}^n \mathbb{E}_x \left( \mathbb{E} (f_i^x)^2 - \frac14 D^2(f_i^x) \right) =
n \mathbb{E} f^2 - \frac14 D^2(f) = K^2(f)$.

From this and the base case, we have

\[ \mathrm{Ent}(f^2) \le \sum_{i=1}^n \mathbb{E}_x \mathrm{Ent}\left( (f_i^x)^2 \right) \le 2 \log 2 \cdot \sum_{i=1}^n \mathbb{E}_x K^2(f_i^x) = 2 \log 2 \cdot K^2(f). \]

^7 Here, as usual, we take $c(0) = \lim_{t \to 0} \frac{4 \alpha(t)}{t (1 + \alpha(t))} = 2$.
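Both the identity $K^2(f) = n \mathbb{E} f^2 - \frac14 D^2(f)$ and the bound of Lemma 2.3 can be checked on random nonnegative functions. The following is a sketch under the same encoding assumption as before (a function is a list of length $2^n$ indexed by integer-encoded vertices); the names are illustrative.

```python
import math
import random

def D2(f, n):
    """D^2(f) = E_x sum_{y~x} (f(x) - f(y))^2."""
    N = 2 ** n
    return sum((f[x] - f[x ^ (1 << i)]) ** 2 for x in range(N) for i in range(n)) / N

def K2(f, n):
    """K^2(f) = (1/4) E_x sum_{y~x} (f(x) + f(y))^2."""
    N = 2 ** n
    return sum((f[x] + f[x ^ (1 << i)]) ** 2 for x in range(N) for i in range(n)) / (4 * N)

def ent_of_square(f):
    """Ent(f^2) under the uniform measure, with 0 log 0 = 0."""
    N = len(f)
    m2 = sum(v * v for v in f) / N
    if m2 == 0:
        return 0.0
    return sum(v * v * math.log(v * v) for v in f if v != 0) / N - m2 * math.log(m2)

def check_lemma(n, trials, seed=0):
    """Check K^2 = n E f^2 - D^2/4 and Ent(f^2) <= 2 log 2 * K^2 on random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        f = [rng.uniform(0, 2) for _ in range(2 ** n)]
        m2 = sum(v * v for v in f) / len(f)
        if abs(K2(f, n) - (n * m2 - D2(f, n) / 4)) > 1e-9:
            return False
        if ent_of_square(f) > 2 * math.log(2) * K2(f, n) + 1e-9:
            return False
    return True
```

In dimension one the bound is attained by $f = (0, 2)$, which corresponds to $s = 1$ in the base case of the proof.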



Let us now pass to the main technical claim in this section. Note that the right hand side
in (13) is well-defined, due to the preceding lemma.

Proposition 2.4: Let $f$ be a non-zero nonnegative function on $\{0,1\}^n$. Then

\[ D^2(f) \ge 4 \cdot K^2(f) \cdot \phi\left( \frac{\mathrm{Ent}(f^2)}{K^2(f)} \right). \tag{13} \]

The proof follows the same outline. First we prove the base case $n = 1$. In this case, we will
show equality

\[ D^2(f) = 4 \cdot K^2(f) \cdot \phi\left( \frac{\mathrm{Ent}(f^2)}{K^2(f)} \right). \]

Indeed, since $f$ is non-zero, we may assume $\mathbb{E} f = 1$. This implies $K^2(f) = 1$. Thus we have to
prove $D^2(f) = 4 \phi\left( \mathrm{Ent}(f^2) \right)$. Let $0 \le s \le 1$, and $f(0) = 1 - s$, $f(1) = 1 + s$. Then $D^2(f) = 4 s^2$
and $4 \phi\left( \mathrm{Ent}(f^2) \right) = 4 \phi\left( \psi(s^2) \right) = 4 s^2$, verifying the base case.

We also need to deal with a slight technicality, the case when $f$ is the zero function, because in
the general case below, some of the one-dimensional restrictions of $f$ might be zero. In this
case, we formally define $\mathrm{Ent}(f^2) / K^2(f)$ to be zero. Then (13) remains valid (as equality)
in the one-dimensional case. It is easy to see that this formal definition does not affect the
computation below.

The general case. Let $n > 1$. Then, by the base case, by subadditivity of the entropy, and
by convexity and monotonicity of $\phi$:

\[ D^2(f) = \sum_{i=1}^n \mathbb{E}_x D^2(f_i^x) = 4 \cdot \sum_{i=1}^n \mathbb{E}_x K^2(f_i^x) \cdot \phi\left( \frac{\mathrm{Ent}\left( (f_i^x)^2 \right)}{K^2(f_i^x)} \right) \ge \]

\[ 4 K^2(f) \cdot \phi\left( \frac{1}{K^2(f)} \cdot \sum_{i=1}^n \mathbb{E}_x \mathrm{Ent}\left( (f_i^x)^2 \right) \right) \ge 4 K^2(f) \cdot \phi\left( \frac{\mathrm{Ent}(f^2)}{K^2(f)} \right). \]

∎
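Inequality (13) lends itself to a direct numerical test, including the equality in dimension one. The sketch below assumes $\phi = \psi^{-1}$ is computed by bisection (arguments above $\psi(1)$ are mapped to approximately 1) and that functions are stored as lists indexed by integer-encoded vertices; the names are illustrative.

```python
import math
import random

def xlogx(u):
    return 0.0 if u == 0 else u * math.log(u)

def psi(t):
    """psi from (12)."""
    s = math.sqrt(t)
    return 0.5 * xlogx((1 - s) ** 2) + 0.5 * xlogx((1 + s) ** 2) - xlogx(1 + t)

def phi(y):
    """Inverse of psi on [0, 2 log 2], by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if psi(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def D2(f, n):
    N = 2 ** n
    return sum((f[x] - f[x ^ (1 << i)]) ** 2 for x in range(N) for i in range(n)) / N

def K2(f, n):
    N = 2 ** n
    return sum((f[x] + f[x ^ (1 << i)]) ** 2 for x in range(N) for i in range(n)) / (4 * N)

def ent_of_square(f):
    N = len(f)
    m2 = sum(v * v for v in f) / N
    return sum(xlogx(v * v) for v in f) / N - xlogx(m2)

def gap_13(f, n):
    """Left side minus right side of (13) for a non-zero nonnegative f."""
    k2 = K2(f, n)
    return D2(f, n) - 4 * k2 * phi(ent_of_square(f) / k2)

def check_13(n, trials, seed=0):
    """Return True if (13) holds (up to rounding) on random nonnegative samples."""
    rng = random.Random(seed)
    return all(gap_13([rng.uniform(0, 2) for _ in range(2 ** n)], n) > -1e-9
               for _ in range(trials))
```

For $n = 1$ the gap should vanish up to the accuracy of the bisection, reflecting the equality in the base case.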

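We proceed to derive (7) from (13). By homogeneity, we may assume $\mathbb{E} f^2 = 1$. Let
$0 < \rho \le \log 2$. Consider the functional

\[ R[f] = \frac{D^2(f)}{\mathrm{Ent}(f^2)}, \]

where $f$ ranges over the non-empty^8 compact set of nonnegative non-zero functions satisfying
$\mathbb{E} f^2 = 1$ and $\mathrm{Ent}(f^2) \ge \rho n$, and the associated minimum

\[ m(\rho) = \min_f R[f]. \]

To complete the proof of (7) and of Theorem 1.2 we will show

\[ m(\rho) \ge c(\rho), \tag{14} \]

where $c$ is the function defined in the third part of Lemma 2.1.

Indeed, fix $\rho$ and let $m = m(\rho)$. Let $f$ be a function at which $R[f]$ attains its minimum,
that is: $\mathbb{E} f^2 = 1$, $\mathrm{Ent}(f^2) \ge \rho n$ and $D^2(f) = m \, \mathrm{Ent}(f^2)$. Then

\[ D^2(f) = m \, \mathrm{Ent}(f^2) \ge m \rho n. \]

This means $K^2(f) = n \mathbb{E} f^2 - \frac14 D^2(f) \le \left( 1 - \frac{m \rho}{4} \right) n$, and therefore $\frac{\mathrm{Ent}(f^2)}{K^2(f)} \ge \frac{4 \rho}{4 - m \rho}$.

Recall $\phi$ is an increasing convex function on $[0, 2 \log 2]$ with $\phi(0) = 0$. Therefore the function
$\tau(y) = \phi(y)/y$ is increasing. This implies

\[ \phi\left( \frac{\mathrm{Ent}(f^2)}{K^2(f)} \right) = \frac{\mathrm{Ent}(f^2)}{K^2(f)} \cdot \tau\left( \frac{\mathrm{Ent}(f^2)}{K^2(f)} \right) \ge \frac{\mathrm{Ent}(f^2)}{K^2(f)} \cdot \tau\left( \frac{4 \rho}{4 - m \rho} \right), \]

and, by Proposition 2.4,

\[ m \, \mathrm{Ent}(f^2) = D^2(f) \ge 4 K^2(f) \cdot \phi\left( \frac{\mathrm{Ent}(f^2)}{K^2(f)} \right) \ge 4 \, \mathrm{Ent}(f^2) \cdot \tau\left( \frac{4 \rho}{4 - m \rho} \right), \]

which means

\[ m \ge 4 \tau\left( \frac{4 \rho}{4 - m \rho} \right). \]

^8 Recall $\mathrm{Ent}(f^2) \ge \mathbb{E} f^2 \log \frac{\mathbb{E} f^2}{\mathbb{E}^2 |f|}$ ([7]).

The rest is simple algebra. Recall $\tau(y) = \phi(y)/y$ and $\psi = \phi^{-1}$. Since $\psi$ is increasing, the last
inequality is equivalent to

\[ \psi\left( \frac{m \rho}{4 - m \rho} \right) \ge \frac{4 \rho}{4 - m \rho}. \]

Substituting $y = \frac{m \rho}{4 - m \rho}$, this translates to $\psi(y)/(y + 1) \ge \rho$. Recall $\alpha = \left( \frac{\psi(y)}{1 + y} \right)^{-1}$. Since $\alpha$ is
increasing, we obtain

\[ \frac{m \rho}{4 - m \rho} \ge \alpha(\rho). \]

This is the same as

\[ m \ge \frac{4 \alpha(\rho)}{\rho (\alpha(\rho) + 1)}. \]

This is equivalent to (14), completing the proof of (7) and of Theorem 1.2. ∎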

3 Proof of Lemma 2.1

Let

\[ h(t) = \frac12 (1 - t)^2 \log (1 - t)^2 + \frac12 (1 + t)^2 \log (1 + t)^2 - (1 + t^2) \log (1 + t^2). \tag{15} \]

In other words, $h(t) = \psi(t^2)$. We will start with some useful properties of the function $h$.

Lemma 3.1:

1. $h' \ge h \ge 0$.

2. $(1 - t^2) h' \ge t h''$.

Proof: We have $h'(t) = 2 \cdot \left( (1 + t) \log(1 + t) - (1 - t) \log(1 - t) - t \log(1 + t^2) \right)$ and
$h''(t) = \frac{4}{1 + t^2} - 2 \log \frac{1 + t^2}{1 - t^2}$.

The first claim of the lemma is easy. Nonnegativity of $h$ follows from nonnegativity of $\psi$,
and

\[ h'(t) - h(t) = (1 - t^2) \log(1 + t) - (1 - t)(3 - t) \log(1 - t) + (1 - t)^2 \log(1 + t^2) \ge 0. \]

The second claim is somewhat harder. We have, rearranging and simplifying:

\[ \frac{1}{2t} \left( (1 - t^2) h' - t h'' \right) = t^2 \log \frac{1 + t^2}{1 - t^2} + \frac{1 - t^2}{t} \log \frac{1 + t}{1 - t} - \frac{2}{1 + t^2}. \]

That is, we need to show

\[ (1 + t^2) \left( t^2 \log \frac{1 + t^2}{1 - t^2} + \frac{1 - t^2}{t} \log \frac{1 + t}{1 - t} \right) \ge 2. \]

In fact, an even stronger inequality



t^ 



.,(l±|),(..,),„,(i±i),. 



is valid for $t \in [0, 1]$. This is easy to check for $t = 1$. To see this for $0 \le t < 1$, recall that for
$-1 < x < 1$, $\log \frac{1 + x}{1 - x} = 2 \sum_{k=0}^{\infty} \frac{x^{2k+1}}{2k+1}$. Substituting these series in the inequality above,
we need to show

t^.yl — + h-t^).yl — >t, 

fc=0 fc=0 

and this is easily verified by observing that all the higher coefficients of the power series on the
left hand side are nonnegative. ∎
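The two claims of Lemma 3.1 can be sanity-checked on a grid. The sketch below assumes the closed forms $h(t) = (1-t)^2 \log(1-t) + (1+t)^2 \log(1+t) - (1+t^2) \log(1+t^2)$ (which equals (15) for $0 \le t < 1$), $h'(t) = 2((1+t)\log(1+t) - (1-t)\log(1-t) - t \log(1+t^2))$ and $h''(t) = \frac{4}{1+t^2} - 2 \log \frac{1+t^2}{1-t^2}$; the derivative formulas are themselves compared with finite differences. Function names are illustrative.

```python
import math

def h(t):
    """h from (15), written with log(1 - t) and log(1 + t); valid for 0 <= t < 1."""
    return ((1 - t) ** 2 * math.log(1 - t) + (1 + t) ** 2 * math.log(1 + t)
            - (1 + t * t) * math.log(1 + t * t))

def h1(t):
    """First derivative of h."""
    return 2 * ((1 + t) * math.log(1 + t) - (1 - t) * math.log(1 - t)
                - t * math.log(1 + t * t))

def h2(t):
    """Second derivative of h."""
    return 4 / (1 + t * t) - 2 * math.log((1 + t * t) / (1 - t * t))

def lemma_3_1_holds(points):
    """Check h' >= h >= 0 and (1 - t^2) h' >= t h'' at the given points of (0, 1)."""
    return all(h(t) >= -1e-12
               and h1(t) - h(t) >= -1e-12
               and (1 - t * t) * h1(t) - t * h2(t) >= -1e-12
               for t in points)
```

A grid such as `[k / 100 for k in range(1, 100)]` covers the open interval; the endpoint $t = 1$ is excluded because $h''$ diverges there.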

We pass to the proof of Lemma 2.1.

Claim 1

We will prove $\psi$ is concave by showing $\psi''$ is negative on $(0, 1)$. We have for $0 < t < 1$:

\[ \psi''(t) = \frac{1}{4 t^{3/2}} \left( \sqrt{t} \, h''(\sqrt{t}) - h'(\sqrt{t}) \right) < 0. \]

The last inequality follows from the second claim of Lemma 3.1.

To see that $\psi$ is increasing, it suffices to verify $\psi' \ge 0$ at 1. Indeed, $\psi'(1) = \frac12 h'(1) = \log 2 >
0$, completing the proof of Claim 1.

Claim 2

Let $\xi(t) = \frac{\psi(t)}{1 + t}$. We will verify $\xi' > 0$ on $(0, 1)$. Indeed,

\[ \xi' = \frac{(1 + t) \psi' - \psi}{(1 + t)^2}, \]

so we need to check that $\psi' > \frac{\psi}{1 + t}$. Substituting $\psi(t) = h(\sqrt{t})$, this amounts to checking

\[ h'(\sqrt{t}) > \frac{2 \sqrt{t}}{1 + t} \cdot h(\sqrt{t}). \]

Since $\frac{2 \sqrt{t}}{1 + t} < 1$ on $(0, 1)$, it suffices to show $h' \ge h$, which is true by the first claim of Lemma 3.1.

To show concavity of $\xi$, we will prove that

\[ -\xi'' \ge 2 \xi', \tag{16} \]

implying $\xi'' < 0$ on $(0, 1)$. Direct calculation gives

\[ \xi'' = \frac{(1 + t)^2 \psi'' - 2 (1 + t) \psi' + 2 \psi}{(1 + t)^3}. \]

Simplifying, $-\xi'' \ge 2 \xi'$ reduces to

\[ 2t \psi \ge 2t (1 + t) \psi' + (1 + t)^2 \psi''. \]

Since $\psi'' < 0$ we have $(1 + t)^2 \psi'' \le 4t \psi''$. Therefore, it suffices to prove

\[ \psi \ge (1 + t) \psi' + 2 \psi''. \]

Again, since $\psi$ is concave with $\psi(0) = 0$, we have $\psi \ge t \psi'$. Therefore, we only need to prove

\[ -2 \psi'' \ge \psi'. \]

Writing this in terms of $h$, this is equivalent to $(1 - s^2) h' \ge s h''$, which is given by the second
claim of Lemma 3.1.

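Claim 4

We will show this claim before the third claim of the lemma. That claim is somewhat more
involved, and its proof is relegated to the end of this section.

Here we need to verify

\[ \frac{\alpha(t)}{1 + \alpha(t)} = \frac12 - \sqrt{H^{-1}(\log 2 - t) \left( 1 - H^{-1}(\log 2 - t) \right)} \]

for all $t \in [0, \log 2]$. This is equivalent to

\[ H^{-1}(\log 2 - t) = \frac12 - \sqrt{\frac{\alpha}{1 + \alpha} \left( 1 - \frac{\alpha}{1 + \alpha} \right)} = \frac{(1 - \sqrt{\alpha})^2}{2 (1 + \alpha)}. \]

Recall that $\alpha = \alpha(t)$ is defined to satisfy $t = \frac{\psi(\alpha)}{1 + \alpha}$, where $\psi(\alpha) = \mathrm{Ent}(g^2)$, and $g$ is a function
on $\{0, 1\}$ with $g(0) = 1 - \sqrt{\alpha}$, $g(1) = 1 + \sqrt{\alpha}$. It is not hard to verify the identity

\[ \frac{\psi(\alpha)}{1 + \alpha} = \log 2 - H\left( \frac{(1 - \sqrt{\alpha})^2}{2 (1 + \alpha)} \right) \]

for all $\alpha \in [0, 1]$, and we are done.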

Claim 3

First, we show that $c$ is increasing. Direct computation gives that $c'$ is positive on $(0, \log 2)$
iff $t \alpha' > \alpha + \alpha^2$ on this interval. Both sides of this inequality are 0 at 0, and we compare
derivatives, that is, show $t \alpha'' \ge 2 \alpha \alpha'$.

Since $\alpha$ is convex with $\alpha(0) = 0$, we have $\alpha \le t \alpha'$ in the interval. Hence, it suffices to show
$\alpha'' \ge 2 (\alpha')^2$.

Recall $\alpha = \xi^{-1}$. Consequently, $\alpha'(\xi(t)) = \frac{1}{\xi'(t)}$, and $\alpha''(\xi(t)) = -\frac{\xi''(t)}{(\xi'(t))^3}$. Therefore, $\alpha'' \ge
2 (\alpha')^2$ is equivalent to $-\xi'' \ge 2 \xi'$, which is given by (16).

It remains to show that $c$ is convex. This turns out to be significantly harder than the other
proofs in this section. We provide a somewhat sketchy argument below.

Direct computation shows that $c'' \ge 0$ on $(0, \log 2)$ iff

\[ t^2 (1 + \alpha) \alpha'' + 2 \alpha (1 + \alpha)^2 \ge 2 t^2 (\alpha')^2 + 2t (1 + \alpha) \alpha'. \tag{17} \]

First, we rewrite this inequality in terms of $\xi = \alpha^{-1}$. Let $t = \xi(x)$, that is $\alpha(t) = x$, $\alpha'(t) = \frac{1}{\xi'(x)}$,
$\alpha''(t) = -\frac{\xi''(x)}{(\xi'(x))^3}$. Substituting and simplifying, one gets

\[ -(1 + x) \xi^2 \xi'' + 2x (1 + x)^2 (\xi')^3 \ge 2 \xi^2 \xi' + 2 (1 + x) \xi (\xi')^2, \]

which has to hold for all $x$ in $(0, 1)$.

Next, we rewrite this in terms of $\psi = (1 + x) \xi$, obtaining

\[ (1 + x) \psi^2 (-\psi'') \ge 2 \left( (1 + x) \psi' - \psi \right)^2 (\psi - x \psi'). \]

Note that all the expressions in the brackets are positive, since $\psi$ is concave and $\xi$ is increasing,
as we saw in the proofs of Claims 1 and 2 above. We simplify this inequality, replacing $\psi$ with
$x \psi'$ on the left hand side and in the first term on the right hand side, and arriving at the
stronger inequality

\[ x (1 + x) \psi (-\psi'') \ge 2 \psi' (\psi - x \psi'). \]

We rewrite this in terms of the function $h$, defined in (15) above. As in the proof of Claim 1,
expressing $\psi$ and its derivatives in terms of $h$ leads to the following equivalent inequality:

\[ 2x (h')^2 \ge (3 - x^2) h h' + x (1 + x^2) h h''. \tag{18} \]

From now on we concentrate on the proof of (18). It will be convenient to write $h$ and its
derivatives in terms of two new functions $L_1(x) = \log \frac{1 + x}{1 - x}$ and $L_2(x) = \log \frac{1 + x^2}{1 - x^2}$. Recalling the
expressions for $h$ and its derivatives (as in the proof of Lemma 3.1 above), we have

• $h(x) = 2x L_1 - (1 + x^2) L_2$

• $h'(x) = 2 L_1 - 2x L_2$

• $h''(x) = \frac{4}{1 + x^2} - 2 L_2$

Rewriting (18) in terms of $L_1$ and $L_2$, and simplifying, one arrives at

\[ (3 - x^2)(1 + x^2) L_1 L_2 + 2x (1 + x^2) L_2 \ge 2x (1 - x^2) L_1^2 + 4x L_2^2 + 4 x^2 L_1. \]

We expand both sides of this inequality as power series for $x \in (0, 1)$. Recall that $L_1(x) =
2 \sum_{k=0}^{\infty} \frac{x^{2k+1}}{2k+1}$, and, consequently, $L_2(x) = 2 \sum_{k=0}^{\infty} \frac{x^{4k+2}}{2k+1}$. Therefore, both sides of this
inequality have only odd terms.
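The low-order coefficients of the two power series can be computed exactly with rational arithmetic. The sketch below assumes the inequality in the form $(3 - x^2)(1 + x^2) L_1 L_2 + 2x(1 + x^2) L_2 \ge 2x(1 - x^2) L_1^2 + 4x L_2^2 + 4x^2 L_1$ and writes the two sides as $4 \sum_d l_d x^d$ and $4 \sum_d r_d x^d$; series are truncated at degree `D`, and the helper names are illustrative.

```python
from fractions import Fraction

D = 15  # series are truncated beyond x^D; coefficients up to degree D stay exact

def poly(coeffs):
    """Dense coefficient list of a polynomial given as {degree: coefficient}."""
    p = [Fraction(0)] * (D + 1)
    for d, v in coeffs.items():
        p[d] = Fraction(v)
    return p

def mul(a, b):
    """Product of two truncated series."""
    c = [Fraction(0)] * (D + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > D:
                break
            c[i + j] += ai * bj
    return c

def add(*series):
    """Sum of truncated series."""
    return [sum(col, Fraction(0)) for col in zip(*series)]

# L1(x) = 2 * sum_k x^(2k+1)/(2k+1),  L2(x) = L1(x^2) = 2 * sum_k x^(4k+2)/(2k+1)
L1 = [Fraction(2, d) if d % 2 == 1 else Fraction(0) for d in range(D + 1)]
L2 = [Fraction(2, d // 2) if d % 4 == 2 else Fraction(0) for d in range(D + 1)]

F = add(mul(mul(poly({0: 3, 2: 2, 4: -1}), L1), L2),   # (3 - x^2)(1 + x^2) L1 L2
        mul(poly({1: 2, 3: 2}), L2))                    # 2x (1 + x^2) L2
G = add(mul(poly({1: 2, 3: -2}), mul(L1, L1)),          # 2x (1 - x^2) L1^2
        mul(poly({1: 4}), mul(L2, L2)),                 # 4x L2^2
        mul(poly({2: 4}), L1))                          # 4x^2 L1

l = [c / 4 for c in F]  # l[d] is the coefficient of x^d in F / 4
r = [c / 4 for c in G]  # r[d] is the coefficient of x^d in G / 4
```

With these definitions, `l[3]`, `r[3]`, `l[5]` and `r[5]` all equal 4, so the two sides agree through degree 5, and the first difference appears at $x^7$, where `l[7] - r[7]` is positive.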




Let the left hand side be equal to

\[ F(x) = 4 \cdot \sum_{k=0}^{\infty} l_{2k+1} x^{2k+1} \]

and the right hand side be equal to

\[ G(x) = 4 \cdot \sum_{k=0}^{\infty} r_{2k+1} x^{2k+1}. \]

We will argue that

1. All the coefficients $l_{2k+1}$ and $r_{2k+1}$ are nonnegative.

2. $l_1 = r_1 = 0$, $l_3 = r_3 = 4$, $l_5 = r_5 = 4$.

3. For all odd $k$ starting from $k = 3$:

$l_{2k+1} \ge r_{2k+1}$ and $l_{2k+1} + l_{2k+3} \ge r_{2k+1} + r_{2k+3}$.

This will imply

\[ F(x) - G(x) = 4 \cdot \sum_{k=3}^{\infty} (l_{2k+1} - r_{2k+1}) x^{2k+1} = 4 \cdot \sum_{\text{odd } k \ge 3} \left( (l_{2k+1} - r_{2k+1}) - (r_{2k+3} - l_{2k+3}) x^2 \right) \cdot x^{2k+1} \ge \]

\[ 4 \cdot \sum_{\text{odd } k \ge 3} (l_{2k+1} - r_{2k+1}) (1 - x^2) \cdot x^{2k+1} \ge 0, \]

completing the proof of (18) and of Claim 3. Hence it remains to prove the properties of the
coefficients.

In fact, the coefficients can be computed explicitly, which makes it possible to verify the
required properties. We omit the (easy but cumbersome) details. For completeness' sake, we do
list explicit expressions for the coefficients below^9.

• For an odd $k \ge 3$:

f 8k-20 \ ^'^^' 1 4 '^ 1 

^*^+^ V(2A;-3)(2A; + 1) J ' ^ 4m - 3 ^ 2k - 1 ' ^ 2m + I ^ 

^ ^ ' ^ ' ^ m=l m=l 

_2_ \\ '^''sp^ 1 /I 3 6 \ 

l"^2A;-l 2A;-3y ^^ 2m - 1 "^ \k ^ k{2k + I) ^ {2k - l){2k + 1) ) 



2k + 
and 

'^2fc+l 



2/C + 2 2 '' ^ 



k(2k-l) k(k-l) 

m=l 



1) ^^ 2m-l 



^9 Our apologies to the reader.

• For an even $k \ge 4$:

_( 8k-20 \ '^ 1 4 ^ 

^''^^ \i2k-3){2k + l)) ' ^2m + l ^ 2k-l' ^4m-3 ^ 



l)j"^2m + l "^ 2A;-l"^4m- 

^ m=l m=l 

^ l__\ ^'"sp^ 1 / 1 6 lOfc - 1 \ 

l^2k-l 2k-3)' ^ 2m-l^ \k-l^ {2k-l){2k + l)^ {k-l){2k-l){2k + l)) 



2k + 

and 

_8 ^ 1 2fc + 2 _ 2 ^ 1 

m=l m=l 

It remains to compute the coefficients for $k = 1, 2$. This is easily done directly, verifying
property 2 above.